AITopics | minimax regret

Stochastic Shortest Path with Sparse Adversarial Costs

Neural Information Processing SystemsJun-22-2026, 03:31:21 GMT

We study the adversarial Stochastic Shortest Path (SSP) problem with sparse costs under full-information feedback. In the known transition setting, existing bounds based on Online Mirror Descent (OMD) with negative-entropy regularization scale with?

artificial intelligence, machine learning, sparsity, (18 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.67)

Add feedback

Bandit Phase Retrieval

Tor Lattimore

Neural Information Processing SystemsApr-26-2026, 17:39:52 GMT

Bandits

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Bandit Phase Retrieval

Tor Lattimore

Neural Information Processing SystemsApr-26-2026, 17:39:48 GMT

We study a bandit version of phase retrieval where the learner chooses actions (At)nt=1 in the d-dimensional unit ball and the expected reward is hAt,?i2 with? 2 Rd an unknown parameter vector. We prove an upper bound on the minimax cumulative regret in this problem of (d p n), which matches known lower bounds up to logarithmic factors and improves on the best known upper bound by a factor of p d. We also show that the minimax simple regret is (d/ p n) and that this is only achievable by an adaptive algorithm. Our analysis shows that an apparently convincing heuristic for guessing lower bounds can be misleading and that uniform bounds on the information ratio for information-directed sampling [Russo and Van Roy, 2014] are not sufficient for optimal regret.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

46489c17893dfdcf028883202cefd6d1-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 16:35:46 GMT

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.47)
Information Technology > Artificial Intelligence > Machine Learning (0.47)

Add feedback

Sequential Probability Assignment with Contexts: Minimax Regret, Contextual Shtarkov Sums, and Contextual Normalized Maximum Likelihood

Neural Information Processing SystemsMar-18-2026, 14:29:48 GMT

We study the fundamental problem of sequential probability assignment, also known as online learning with logarithmic loss, with respect to an arbitrary, possibly nonparametric hypothesis class. Our goal is to obtain a complexity measure for the hypothesis class that characterizes the minimax regret and to determine a general, minimax optimal algorithm. Notably, the sequential $\ell_{\infty}$ entropy, extensively studied in the literature (Rakhlin and Sridharan, 2015, Bilodeau et al., 2020, Wu et al., 2023), was shown to not characterize minimax regret in general. Inspired by the seminal work of Shtarkov (1987) and Rakhlin, Sridharan, and Tewari (2010), we introduce a novel complexity measure, the \emph{contextual Shtarkov sum}, corresponding to the Shtarkov sum after projection onto a multiary context tree, and show that the worst case log contextual Shtarkov sum equals the minimax regret. Using the contextual Shtarkov sum, we derive the minimax optimal strategy, dubbed \emph{contextual Normalized Maximum Likelihood} (cNML). Our results hold for sequential experts, beyond binary labels, which are settings rarely considered in prior work. To illustrate the utility of this characterization, we provide a short proof of a new regret upper bound in terms of sequential $\ell_{\infty}$ entropy, unifying and sharpening state-of-the-art bounds by Bilodeau et al. (2020) and Wu et al. (2023).

artificial intelligence, name change, proceedings, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of \Theta(T {2/3}) and its Application to Best-of-Both-Worlds

Neural Information Processing SystemsMar-18-2026, 04:19:51 GMT

Follow-the-Regularized-Leader (FTRL) is a powerful framework for various online learning problems. By designing its regularizer and learning rate to be adaptive to past observations, FTRL is known to work adaptively to various properties of an underlying environment. However, most existing adaptive learning rates are for online learning problems with a minimax regret of $\Theta(\sqrt{T})$ for the number of rounds $T$, and there are only a few studies on adaptive learning rates for problems with a minimax regret of $\Theta(T^{2/3})$, which include several important problems dealing with indirect feedback. To address this limitation, we establish a new adaptive learning rate framework for problems with a minimax regret of $\Theta(T^{2/3})$. Our learning rate is designed by matching the stability, penalty, and bias terms that naturally appear in regret upper bounds for problems with a minimax regret of $\Theta(T^{2/3})$. As applications of this framework, we consider three major problems with a minimax regret of $\Theta(T^{2/3})$: partial monitoring, graph bandits, and multi-armed bandits with paid observations. We show that FTRL with our learning rate and the Tsallis entropy regularizer improves existing Best-of-Both-Worlds (BOBW) regret upper bounds, which achieve simultaneous optimality in the stochastic and adversarial regimes. The resulting learning rate is surprisingly simple compared to the existing learning rates for BOBW algorithms for problems with a minimax regret of $\Theta(T^{2/3})$.

artificial intelligence, machine learning, minimax regret, (7 more...)

Neural Information Processing Systems

Industry: Education > Focused Education > Special Education (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback